Self-Supervision for Reinforcement Learning

Authors

  • Parsa Mahmoudieh
  • Trevor Darrell
  • Sergey Levine
Abstract

Reinforcement learning optimizes policies for expected cumulative reward. Need the supervision be so narrow? Reward is delayed and sparse for many tasks, making it a difficult and impoverished signal for end-to-end optimization. To augment reward, we consider a range of self-supervised tasks that incorporate states, actions, and successors to provide auxiliary losses. These losses offer ubiquitous and instantaneous supervision for representation learning even in the absence of reward. While current results show that learning from reward alone is feasible, pure reinforcement learning methods are constrained by computational and data efficiency issues that can be remedied by auxiliary losses. Self-supervised pretraining and joint optimization improve the data efficiency and policy returns of end-to-end reinforcement learning.
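As a rough illustration of how an auxiliary loss can sit alongside the reward-driven objective, the sketch below adds a weighted self-supervised term to a placeholder reinforcement learning loss. The abstract does not name a specific task, so the choice here (predicting the action from a state and its successor, often called inverse dynamics), the linear model, the names `inverse_dynamics_loss` and `joint_loss`, and the weight `aux_weight` are all assumptions made for the example, not the authors' implementation.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def inverse_dynamics_loss(weights, transitions):
    """Mean cross-entropy for predicting the action from (state, successor).

    Hypothetical linear model: `weights` holds one weight vector per action,
    applied to the concatenated state and successor features.
    """
    losses = []
    for state, action, successor in transitions:
        features = state + successor  # list concatenation
        logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
        probs = softmax(logits)
        losses.append(-math.log(probs[action]))
    return sum(losses) / len(losses)

def joint_loss(rl_loss, aux_loss, aux_weight=0.1):
    """Joint objective: RL loss plus a weighted auxiliary loss (assumed form)."""
    return rl_loss + aux_weight * aux_loss

# Synthetic transitions (state, action, successor); no reward is needed
# to compute the auxiliary term.
random.seed(0)
state_dim, num_actions = 4, 3
transitions = [
    ([random.gauss(0, 1) for _ in range(state_dim)],
     random.randrange(num_actions),
     [random.gauss(0, 1) for _ in range(state_dim)])
    for _ in range(32)
]

# Zero weights give a uniform prediction over actions.
weights = [[0.0] * (2 * state_dim) for _ in range(num_actions)]
aux = inverse_dynamics_loss(weights, transitions)
total = joint_loss(rl_loss=1.0, aux_loss=aux)  # rl_loss is a placeholder value
```

The auxiliary term is defined for every transition, whether or not a reward arrives, which is the sense in which such losses are instantaneous. In a real agent the two terms would share a learned representation and be minimized together by gradient descent.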

Similar resources

Loss is its own Reward: Self-Supervision for Reinforcement Learning

Reinforcement learning optimizes policies for expected cumulative reward. Need the supervision be so narrow? Reward is delayed and sparse for many tasks, making it a difficult and impoverished signal for end-to-end optimization. To augment reward, we consider a range of self-supervised tasks that incorporate states, actions, and successors to provide auxiliary losses. These losses offer ubiquito...

SUPERVISION OF MIDWIVES AT THE UNIVERSITY OF HERTFORDSHIRE

Supervision of Midwives is a statutory responsibility which provides a “mechanism for support and guidance to every midwife practising in the United Kingdom” (Nursing and Midwifery Council, NMC and The Local Supervising Authority Midwifery Officers National Forum, LSAMONF, 2008). To become a SoM requires a midwife to be nominated by her peers and to undertake a course at Masters Level over two ...

Efficient Multi-Agent Reinforcement Learning through Automated Supervision (Short Paper)

Multi-Agent Reinforcement Learning (MARL) algorithms suffer from slow convergence and even divergence, especially in large-scale systems. In this work, we develop a supervision framework to speed up the convergence of MARL algorithms in a network of agents. The framework defines an organizational structure for automated supervision and a communication protocol for exchanging information between...

Efficient multi-agent reinforcement learning through automated supervision

Multi-Agent Reinforcement Learning (MARL) algorithms suffer from slow convergence and even divergence, especially in large-scale systems. In this work, we develop a supervision framework to speed up the convergence of MARL algorithms in a network of agents. The framework defines an organizational structure for automated supervision and a communication protocol for exchanging information between...

A Comparative Study of Self-Supervision and the Self-Efficacy of Iranian EFL Teachers and Those of Intermediate Adult Learners

The present study was conducted to examine the relationship between the self-supervision and the self-efficacy of Iranian EFL teachers and also the relationship between the self-supervision and the self-efficacy of intermediate adult learners individually. To this end, 40 EFL teachers and 55 intermediate adult learners were selected from two branches of Kish Language Institute. In this study, “...

Publication date: 2017